The code chunk below sets up some global options for the document. There is nothing to do here, but you can learn a trick for navigating the document very quickly. Notice that the code chunk below starts with “r setup, include=FALSE”. In the very bottom left corner of the Source pane, you will see an orange hash symbol and, to the right of it, a dropdown menu. The dropdown menu is created automatically from the headings in your document and from the names of the code chunks. The code chunk below is called “setup” because we told the computer so by typing “r setup”. The “r” indicates that the computer should execute the code in the R language, and the word “setup” is an optional label that names the code chunk. The “include=FALSE” part is called a code chunk option; this one tells the computer to execute the code but to show neither the code nor its output in the knit document.
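For reference, here is a minimal sketch of what a setup chunk often contains. The chunk header is shown as a comment, and the echo = TRUE option is an assumption, because the real setup code is hidden in the knit document.

```r
# In the .Rmd source, the chunk header looks like this (shown here as a comment):
#   {r setup, include=FALSE}
#   "r" = language, "setup" = chunk name, include=FALSE = chunk option

# Assumed global option: show the code of every later chunk in the knit document.
knitr::opts_chunk$set(echo = TRUE)

# Check what is currently set.
knitr::opts_chunk$get("echo")
```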
Hit the run button on the code below.
Optional note: There are technical differences between packages and libraries that are not important for the beginner to know. When starting out in R, imagine that packages are bundles of code (e.g. little packages) that contain related functions (code that does specific tasks). There are many packages available for R, and you can create one if you need to make something very custom. For the most part, however, anything commonly done already has a package. A quick Google search to find an appropriate package for your task is often the best place to start. Packages can be easily downloaded onto your computer with very little fuss. I have already pre-installed the ones needed for this document. You might use different packages in different projects, so when you create a new R Markdown document, you will need to load each package by calling the library function like below. In short, you install a package once (though you might have to update it from time to time), but you need to load it with library() in every new R session before you use it.
library(readxl)
library(kableExtra)
library(gridExtra)
library(ggsci)
library(extrafont)
library(qcc)
library(SixSigma)
library(runcharter)
library(tidyverse)
library(plotly)
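The install-once, load-every-time pattern can be sketched like this. It is only an illustration using readxl from the list above; install.packages() needs an internet connection and is skipped when the package is already on your computer.

```r
# Install the package only if it is missing (done once per computer).
if (!requireNamespace("readxl", quietly = TRUE)) {
  install.packages("readxl")
}

# Load the package (done in every new R session or document that uses it).
library(readxl)

# Confirm that the package is now attached.
"package:readxl" %in% search()
```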
Ok let’s make some plots for your potential QI project!
Step 1, press the run button at the top of the code chunk and see the resulting image.
cause.and.effect(cause=list(
People = c("Order illegible", "Phone unanswered", "Heavy Workload"),
Environment = c("Transcription Error", "Rounding"),
Materials = c("Out-of-stock", "Spoiled"),
Methods = c("Too many people", "Lab Handling"),
Equipment = c("Speed", "Broken Pager", "Phone Capacity")),
effect="Long \ntest \nresults \ntime")
Here’s your first task. Change the code below to make a cause and effect diagram with your own inputs. A simple change to help you get started: find the place in the code below that looks like Materials = c("Out-of-stock", "Spoiled") and change “Out-of-stock” and “Spoiled” to something else. Run the code chunk and see your change in action.
cause.and.effect(cause=list(
People = c("Order illegible", "Phone unanswered", "Heavy Workload"),
Environment = c("Transcription Error", "Rounding"),
Materials = c("Out-of-stock", "Spoiled"),
Methods = c("Too many people", "Lab Handling"),
Equipment = c("Speed", "Broken Pager", "Phone Capacity")),
effect="Long \ntest \nresults \ntime")
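If you get stuck, one way to organise the change is to build the list of causes first and then pass it to the function. This is just a sketch: the two Materials entries are made-up placeholders, and I am assuming that cause.and.effect() comes from the qcc package (version 2.7, as loaded above).

```r
library(qcc)

# Build the causes as a named list; each name becomes one branch of the diagram.
my_causes <- list(
  People      = c("Order illegible", "Phone unanswered", "Heavy Workload"),
  Environment = c("Transcription Error", "Rounding"),
  Materials   = c("Expired reagent", "Wrong tube type"),  # hypothetical replacements
  Methods     = c("Too many people", "Lab Handling"),
  Equipment   = c("Speed", "Broken Pager", "Phone Capacity")
)

# Draw the diagram with the same effect text as before.
qcc::cause.and.effect(cause = my_causes,
                      effect = "Long \ntest \nresults \ntime")
```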
The data for the histogram are generated from code in R that creates simulated data based on some different parameters. This is a good way to make the graphs you want before you have the real data for your project. It also might be helpful when you are deciding whether or not to do a project because if you simulate the best possible results, and they are not “good enough” for the effort, you might choose to tackle a different issue or a different part of the problem.
Your next task is to:
Run the code as is and examine the output
Change the names of the clinics in the simulated data. Currently they are called Cardiology Clinic and Endocrinology Clinic.
Make up a clinic campus name and replace XYZ with a campus name of your choice. Hint: Control + F works to search a document.
Delete the hash before the line of code theme_bw()
# Data generation
clinic_1 <- "Cardiology Clinic"
clinic_2 <- "Endocrinology Clinic"
pre_cards <- tibble(days = round(runif(100, 0, 21)),
intervention = "Before Intervention",
clinic = clinic_1)
post_cards <- tibble(days = round(runif(100, 0, 14)),
intervention = "After Intervention",
clinic = clinic_1)
pre_gi <- tibble(days = round(runif(100, 0, 20)),
intervention = "Before Intervention",
clinic = clinic_2)
post_gi <- tibble(days = round(runif(100, 0, 15)),
intervention = "After Intervention",
clinic = clinic_2)
#Processing the data
data_histo <- bind_rows(pre_cards, post_cards, pre_gi, post_gi)
data_histo$intervention <- as_factor(data_histo$intervention)
data_histo$clinic <- as_factor(data_histo$clinic)
#Plotting the data
plot_data_histo <- ggplot(data_histo, aes(days, fill = clinic)) +
geom_histogram(binwidth = 1, color = 'white') +
labs(title = "Days between date of encounter and note signed",
subtitle = "Adult Patients at XYZ Campus",
y = "Number of Notes",
x = "Days",
caption="Simulated Data") +
scale_y_continuous(breaks = c(0, 4, 8, 12, 16, 20, 24)) +
facet_grid(intervention ~ clinic) +
scale_color_npg(palette = "nrc")+
scale_fill_npg(palette = "nrc") +
# theme_bw() +
theme(legend.position = "none") +
theme(text=element_text(family = "serif"))
#Showing the plot
plot_data_histo
Bonus: Just run this to see an example of the same plot but with interactive features.
ggplotly(plot_data_histo)
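To judge whether a simulated best case is “good enough” before you commit to a project, it can help to put a number on it. Here is a rough sketch using the same kind of simulated data as above; the seed and the choice of the median as the summary are my assumptions.

```r
set.seed(42)  # assumed seed so the simulated numbers are reproducible

# Same simulation idea as above: 100 notes per period, rounded to whole days.
before <- round(runif(100, 0, 21))
after  <- round(runif(100, 0, 14))

# Compare the typical delay before and after the (simulated) intervention.
median(before)
median(after)
median(before) - median(after)
```

If the difference in this best-case simulation looks too small to matter, that is a hint to rethink the project before collecting real data.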
If you already have data from a project, you can read it into R and manipulate it without modifying any of the data in the original files. Many different file formats can be read into R, but here I am using an Excel spreadsheet that is stored in this project. All data is simulated, fake, or reproductions of publicly available data. Note: There is an excellent package called googlesheets4 that makes it easy to import data stored in a Google Sheet into R.
Your tasks:
Look at the Environment panel. Do you see pareto or fmea? Hit the run button on the code chunk below. Now look at the Environment panel again. Are they there now?
fake_data <- read_excel("./data/qi_spreadsheet_workshop.xlsx",
sheet = "scatter")
pareto <- read_excel("./data/qi_spreadsheet_workshop.xlsx",
sheet = "Pareto")
fmea <- read_excel("./data/qi_spreadsheet_workshop.xlsx",
sheet = "fmea")
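If you are not sure what the sheets in a workbook are called, readxl can list them for you. The sketch below uses an example workbook that ships with the readxl package (not the workshop spreadsheet), so it runs on any computer with readxl installed.

```r
library(readxl)

# Path to an example workbook bundled with readxl (not the workshop file).
path <- readxl_example("datasets.xlsx")

# List the sheet names so you know what to pass to the sheet argument.
excel_sheets(path)

# Read one sheet by name, just like the code above does.
mtcars_xl <- read_excel(path, sheet = "mtcars")
head(mtcars_xl)
```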
Your task now is to play around with the kable_styling() function to change the appearance of the table. Run the code, and look at the table. Now look at options already in the code or use the help search bar to look for kable_styling or type into the console ?kable_styling. Hint: Do something simple like change the font size.
fmea %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"),
font_size = 7)
| Steps in Process | Failure Mode | Failure Cases | Failure Effects | Likelihood of Occurrence | Likelihood of Detection | Severity | Risk Profile Number | Actions To Reduce Occurrence of Failure |
|---|---|---|---|---|---|---|---|---|
| Orders are written for new medications. | The first dose may be given prior to pharmacist review of the orders. | Medication ordered may be available and easily accessed in the dispensing machine. | Patient may receive incorrect medication, incorrect dose, or a dose via incorrect route. | 6 | 5 | 1 | 30 | Assign clinical pharmacists to patient care units so that all medication orders can be reviewed as they occur. |
| Orders are written to discontinue a medication or change the existing order. | Orders are written to discontinue a medication or change the existing order. | All doses needed for a 24-hour period are delivered to the drawer. Drawer is not changed until next routine delivery. 24-hour supply of refrigerated medications is delivered. Multi-dose vials may be kept in the patientspecific drawer. Medications are available in dispensing machine. | Patients may receive medications that have been discontinued or the incorrect dose of a medication that has been changed. | 10 | 5 | 5 | 250 | Schedule pick-ups of discontinued medications, including refrigerated medications, twice per day. Use dispensing machine screen to verify all information regarding current and discontinued medications prior to each administration. |
| Orders are written for a non-standard dose of a medication. | Nursing staff may prepare an incorrect dose when manipulating the medication. | Staff prepare the dose using medications from the dispensing machine and manipulate them to get the dose ordered. | Patient may receive an incorrect dose. | 3 | 5 | 4 | 60 | Prepare all nonstandard doses in the pharmacy and dispense each as a patient-specific unit dose. |
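The Risk Profile Number values in the table are consistent with multiplying the three scores together (occurrence x detection x severity). Here is a small sketch that recomputes them from the scores shown above; the data frame and its column names are made up for illustration and are not the columns of the spreadsheet.

```r
# Scores copied from the three rows of the table above.
fmea_scores <- data.frame(
  occurrence = c(6, 10, 3),
  detection  = c(5, 5, 5),
  severity   = c(1, 5, 4)
)

# Risk Profile Number as the product of the three scores.
fmea_scores$rpn <- fmea_scores$occurrence * fmea_scores$detection * fmea_scores$severity

# Sort so the highest-risk step comes first.
fmea_scores[order(-fmea_scores$rpn), ]
```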
Run the code chunk and look at the graph.
At the very last part of the code chunk, delete the hash, and then after the + sign add the theme function of your choice: theme_bw(), theme_dark(), theme_classic()
Rerun the code chunk and look at the two graphs.
plot_pareto <-
ggplot(pareto, aes(x = reorder(`Error Type`,-`Frequency`),
y = `Frequency`)) +
geom_col() +
labs(
title = "Types of Errors Discovered During Surgical Set-up",
subtitle = "Pareto Chart",
x = " ",
y = "Frequency",
caption = "Source data from IHI QI Toolkits"
) +
scale_x_discrete(
labels = function(x)
str_wrap(x, width = 10)
) +
annotate("text",
x = 2,
y = 75,
label = "Vital Few") +
annotate(
"pointrange",
x = 2,
y = 70,
xmin = 1,
xmax = 3,
colour = "black",
size = 1
) +
annotate("text",
x = 6,
y = 40,
label = "Useful many") +
annotate(
"pointrange",
x = 6,
y = 35,
xmin = 4,
xmax = 8,
colour = "black",
size = 1
)
plot_pareto
#plot_pareto +
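The part of the code that makes this a Pareto chart is reorder(), which sorts the bars from most to least frequent. Here is a stripped-down sketch with a tiny made-up data set (the error names and counts are hypothetical), followed by one way to finish the theme task.

```r
library(ggplot2)

# Hypothetical data: three error types and how often each occurred.
errors <- data.frame(
  error_type = c("A", "B", "C"),
  frequency  = c(5, 20, 11)
)

# reorder() with a minus sign puts the largest frequency first.
bar_order <- levels(reorder(errors$error_type, -errors$frequency))
bar_order

small_pareto <- ggplot(errors, aes(x = reorder(error_type, -frequency),
                                   y = frequency)) +
  geom_col()

# Adding a theme after the + sign changes the look, not the data.
small_pareto + theme_classic()
```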
Run the code below and inspect the plot.
Replace the current function geom_jitter() with geom_point().
Re-run the code. What differences do you see? Why might you want to use geom_jitter() instead of geom_point()?
scatter_plot <-
ggplot(fake_data, aes(x = count, y = time, color = resident)) +
#Hint!
geom_jitter() +
labs(
title = "Average Time from Admission Order to Order Reconciliation Completed",
subtitle = "Academic Year 2012-2013 \nPGY-2 Residents",
caption = "Simulated Data",
x = "Number of Residents on Service",
y = "Average Time (Mins)"
) +
scale_x_continuous(breaks = (1:10)) +
scale_color_jama() +
scale_fill_lancet() +
theme_bw() +
scale_color_discrete(name = "Service Type") +
theme(
legend.justification = c(1, 0),
legend.position = c(0.95, 0.4),
legend.box.background = element_rect(color = "black")
) +
theme(text = element_text(family = "serif"))
## Scale for 'colour' is already present. Adding another scale for 'colour',
## which will replace the existing scale.
scatter_plot
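Once you have tried the exercise, this small sketch may help you check your thinking. It uses made-up data in which many points share exactly the same position, and then counts how many distinct positions each geom actually draws. The data and the jitter width are my assumptions.

```r
library(ggplot2)

# Hypothetical data: 60 rows, but only 3 distinct (count, time) positions.
overlap <- data.frame(
  count = rep(1:3, each = 20),
  time  = rep(c(10, 20, 30), each = 20)
)

p_point  <- ggplot(overlap, aes(count, time)) + geom_point()
p_jitter <- ggplot(overlap, aes(count, time)) + geom_jitter(width = 0.2, height = 0)

# Count the distinct positions that end up on each plot.
n_point  <- nrow(unique(layer_data(p_point)[c("x", "y")]))
n_jitter <- nrow(unique(layer_data(p_jitter)[c("x", "y")]))
c(point = n_point, jitter = n_jitter)
```

With geom_point() the overlapping rows are drawn on top of each other, while geom_jitter() nudges each point by a small random amount so you can see how many there are.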
Here is a great, short talk about Run Charts and background knowledge specific to R by the package author of runcharter.
Let’s look at the data that comes with the package.
FYI: Dealing with time series data can get very tricky, very quickly.
First, run this code chunk. It will load some data and show you a table of the first 10 rows of the data. What are the columns in your table?
signals <- runcharter::signals
head(signals, 10) %>%
kable() %>%
kable_styling()
| grp | y | date |
|---|---|---|
| WardX | 9 | 2014-01-01 |
| WardX | 22 | 2014-02-01 |
| WardX | 19 | 2014-03-01 |
| WardX | 18 | 2014-04-01 |
| WardX | 8 | 2014-05-01 |
| WardX | 7 | 2014-06-01 |
| WardX | 11 | 2014-07-01 |
| WardX | 11 | 2014-08-01 |
| WardX | 11 | 2014-09-01 |
| WardX | 12 | 2014-10-01 |
Pretend that the column called “grp” contains the hospital units you are interested in studying. Then “y” is the outcome of interest, and “date” is the date of the measurement.
Let’s rename the columns to something more clear for the reader.
In the code chunk below:
Replace the "grp" with a better column header.
Replace the "y" with a more descriptive header.
Run the code again to see the new column names.
my_groups <- "grp"
my_outcome <- "y"
head(signals, 10) %>%
kable(col.names = c(my_groups,
my_outcome,
"Dates")) %>%
kable_styling()
| grp | y | Dates |
|---|---|---|
| WardX | 9 | 2014-01-01 |
| WardX | 22 | 2014-02-01 |
| WardX | 19 | 2014-03-01 |
| WardX | 18 | 2014-04-01 |
| WardX | 8 | 2014-05-01 |
| WardX | 7 | 2014-06-01 |
| WardX | 11 | 2014-07-01 |
| WardX | 11 | 2014-08-01 |
| WardX | 11 | 2014-09-01 |
| WardX | 12 | 2014-10-01 |
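Note that col.names only changes the headers shown in the table; the columns in the data keep their original names. If you want to rename the columns in the data itself, one option is rename() from dplyr. This is a sketch on a three-row copy of the table above, and the new names “unit” and “count” are just placeholders.

```r
library(dplyr)

# A small copy of the first three rows shown in the table above.
signals_small <- data.frame(
  grp  = "WardX",
  y    = c(9, 22, 19),
  date = as.Date(c("2014-01-01", "2014-02-01", "2014-03-01"))
)

# rename(new_name = old_name) changes the column names in the data itself.
renamed <- signals_small %>%
  rename(unit = grp, count = y)

names(renamed)
```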
Now let’s make the plot that came with the package documentation. Run the code chunk.
runchart_1 <- signals %>%
runcharter(med_rows = 7,
runlength = 5,
direction = "both",
datecol = date,
grpvar = grp,
yval = y,
chart_title = "Runs in both directions",
chart_subtitle = "Runs of 5, from median calculated over first 7 data points in each location")
runchart_1$runchart
Let’s just look at WardX. Run the code below.
wardx <- signals %>% filter(grp == "WardX")
runchart_2 <- wardx %>%
runcharter(med_rows = 7,
runlength = 5,
direction = "both",
datecol = date,
grpvar = grp,
yval = y,
chart_title = "Runs in both directions",
chart_subtitle = "Runs of 5, from median calculated over first 7 data points in each location")
runchart_2$runchart
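To get a feel for what the subtitle means, here is a simplified sketch of the idea using only the 10 WardX values from the table above. It is not how runcharter is implemented; in particular, ignoring points that sit exactly on the median is a common run chart convention that I am assuming here.

```r
# The first 10 WardX values from the table above.
wardx_y <- c(9, 22, 19, 18, 8, 7, 11, 11, 11, 12)

# Baseline median from the first 7 data points (med_rows = 7).
baseline <- median(wardx_y[1:7])
baseline

# Which side of the median is each point on? (-1 below, 1 above, 0 on the line)
side <- sign(wardx_y - baseline)

# Assumption: points exactly on the median are ignored when counting runs.
side <- side[side != 0]

# Length of each run of consecutive points on the same side.
runs <- rle(side)
longest_run <- max(runs$lengths)

# A signal would need a run of at least 5 (runlength = 5).
longest_run >= 5
```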
Locate the qi_workshop.bib file in the File pane. Open it up and take a peek so the following will make more sense.
Keeping track of your citations and sources can be hard. There are many options, but the workflow I will share is how to cite using a .bib file and a DOI. My standard practice is using Zotero which has nice integration with RStudio with no extra effort on the user’s part.
Here’s me citing the paper1 that inspired me to make this workshop; it is already in the qi_workshop.bib file within this project.
Here is a paper2 I semi-randomly found on pubmed.gov examining selective oxidation of things in raw sugar cane. I can cite this using a DOI and then it automatically flows into my qi_workshop.bib file!!!
When citing things in an .Rmd, you have (at least) three options.
[@key] where the key is replaced with the entry, highlighted in blue below, from your my_generic_citations.bib file.
Knit the document and take a look at the result.
If you use RStudio, you should use projects. I have set this all up for you here, but in general a project keeps your data and files together and easy to find. It also makes your code easier to share, because you will be using relative paths rather than accidentally using absolute paths. Don’t be overwhelmed by this, but here is more reading for your enjoyment later.
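Here is a small sketch of what a relative path looks like in practice. It reuses the spreadsheet location from earlier in this document; whether the file is found depends on your working directory being the project folder.

```r
# A relative path starts from the project folder, not from the top of your disk.
rel_path <- file.path("data", "qi_spreadsheet_workshop.xlsx")
rel_path

# Inside an RStudio project, the working directory is the project folder.
getwd()

# TRUE only if the file exists relative to the current working directory.
file.exists(rel_path)
```

Because the path never mentions your user name or drive, the same code works when someone else opens the project on their own computer.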
sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
## LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
##
## locale:
## [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
## [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
## [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] plotly_4.9.3 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.5
## [5] purrr_0.3.4 readr_1.4.0 tidyr_1.1.3 tibble_3.1.1
## [9] ggplot2_3.3.3 tidyverse_1.3.1 runcharter_0.2.0 SixSigma_0.9-52
## [13] qcc_2.7 extrafont_0.17 ggsci_2.9 gridExtra_2.3
## [17] kableExtra_1.3.4 readxl_1.3.1
##
## loaded via a namespace (and not attached):
## [1] httr_1.4.2 jsonlite_1.7.2 viridisLite_0.4.0 modelr_0.1.8
## [5] assertthat_0.2.1 highr_0.8 cellranger_1.1.0 yaml_2.2.1
## [9] Rttf2pt1_1.3.8 pillar_1.6.0 backports_1.2.1 lattice_0.20-41
## [13] glue_1.4.2 extrafontdb_1.0 digest_0.6.27 rvest_1.0.0
## [17] colorspace_2.0-0 htmltools_0.5.1.1 pkgconfig_2.0.3 broom_0.7.6
## [21] haven_2.4.0 xtable_1.8-4 scales_1.1.1 webshot_0.5.2
## [25] svglite_2.0.0 farver_2.1.0 generics_0.1.0 ellipsis_0.3.1
## [29] withr_2.4.2 lazyeval_0.2.2 cli_2.4.0 magrittr_2.0.1
## [33] crayon_1.4.1 evaluate_0.14 fs_1.5.0 fansi_0.4.2
## [37] MASS_7.3-53 xml2_1.3.2 tools_4.0.3 data.table_1.14.0
## [41] hms_1.0.0 lifecycle_1.0.0 munsell_0.5.0 reprex_2.0.0
## [45] compiler_4.0.3 systemfonts_1.0.1 rlang_0.4.10 grid_4.0.3
## [49] rstudioapi_0.13 htmlwidgets_1.5.3 crosstalk_1.1.1 labeling_0.4.2
## [53] rmarkdown_2.7 testthat_3.0.2 gtable_0.3.0 DBI_1.1.1
## [57] R6_2.5.0 zoo_1.8-9 lubridate_1.7.10 knitr_1.32
## [61] utf8_1.2.1 stringi_1.5.3 Rcpp_1.0.6 vctrs_0.3.7
## [65] dbplyr_2.1.1 tidyselect_1.1.0 xfun_0.22